Complexity and Irregularity in the Lexicon
McGill University, Mila, Canada CIFAR AI Chair
he/him
SCiL 2024
June 28, 2024
As a language increases in complexity in one area, it must decrease in complexity in another to compensate.
Objective measurement is difficult, but impressionistically it would seem that the total grammatical complexity of any language, counting both morphology and syntax, is about the same as that of any other. This is not surprising, since all languages have about equally complex jobs to do, and what is not done morphologically has to be done syntactically.
– Hockett 1958, “A Course in Modern Linguistics”
…But what if these are both correlated with something else, like frequency?
How do we define a trade-off when more than two variables are involved?
Our question: Does a trade-off between phonotactic complexity and morphological irregularity exist in a larger set of languages? Is it universal?
Imagine a language where word length is a mediator in the relationship between frequency and phonotactic complexity.
\[ \begin{align} FR &\sim \mathcal{N}(\mu = 2, \sigma = 1)\\ WL &\sim FR + \mathcal{N}(\mu = 0, \sigma = .2)\\ PC &\sim WL + \mathcal{N}(\mu = 0, \sigma = .2)\\ \end{align} \]
\(\rho\) = 0.9814317
\(\rho\) = 0.9642839
\(\rho\) = 0.9829583
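The mediator scenario above can be simulated directly. This is a minimal sketch using only the Python standard library; the sample size, the seed, and the choice of Pearson correlation are assumptions, since the slides do not say how \(\rho\) was computed or which pair each value belongs to.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_mediator(n=10_000, seed=0):
    """FR -> WL -> PC, with the noise levels given in the equations above."""
    rng = random.Random(seed)
    fr = [rng.gauss(2, 1) for _ in range(n)]       # FR ~ N(2, 1)
    wl = [f + rng.gauss(0, 0.2) for f in fr]       # WL ~ FR + N(0, .2)
    pc = [w + rng.gauss(0, 0.2) for w in wl]       # PC ~ WL + N(0, .2)
    return fr, wl, pc

fr, wl, pc = simulate_mediator()
r_fr_wl = pearson(fr, wl)
r_wl_pc = pearson(wl, pc)
r_fr_pc = pearson(fr, pc)
print(f"FR-WL: {r_fr_wl:.3f}  WL-PC: {r_wl_pc:.3f}  FR-PC: {r_fr_pc:.3f}")
```

All three pairs come out strongly correlated, including FR and PC, which are linked only through WL.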
If frequency and phonotactic complexity are common causes of word length, how does frequency affect phonotactic complexity?
\[ \begin{align} FR &\sim \mathcal{N}(\mu = 2, \sigma = 1)\\ PC &\sim \mathcal{N}(\mu = 0, \sigma = .2)\\ WL &\sim FR + PC + \mathcal{N}(\mu = 0, \sigma = .2)\\ \end{align} \]
| Predictor | Estimate | CI | p |
|---|---|---|---|
| (Intercept) | 0.00 | -0.01 – 0.02 | 0.652 |
| FR | -0.50 | -0.53 – -0.47 | <0.001 |
| WL | 0.50 | 0.46 – 0.53 | <0.001 |
With WL in the model, FR shows a negative effect on PC, even though PC is generated independently of FR.
| Predictor | Estimate | CI | p |
|---|---|---|---|
| (Intercept) | 0.01 | -0.02 – 0.04 | 0.470 |
| FR | -0.00 | -0.02 – 0.01 | 0.568 |
FR has 0 effect on PC, as expected!
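The collider effect can be reproduced with a small simulation. This is a sketch under stated assumptions: the sample size and seed are arbitrary, and the hand-written least-squares helpers stand in for whatever regression software produced the tables.

```python
import random

def slope_one(y, x):
    """Least-squares slope of y on x (model with intercept)."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slopes_two(y, x1, x2):
    """Least-squares slopes of y on x1 and x2 (model with intercept),
    solved from the centered 2x2 normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(0)
n = 10_000
fr = [rng.gauss(2, 1) for _ in range(n)]                    # FR ~ N(2, 1)
pc = [rng.gauss(0, 0.2) for _ in range(n)]                  # PC independent of FR
wl = [f + p + rng.gauss(0, 0.2) for f, p in zip(fr, pc)]    # WL ~ FR + PC + noise

b_fr, b_wl = slopes_two(pc, fr, wl)   # PC ~ FR + WL (conditions on the collider)
b_fr_only = slope_one(pc, fr)         # PC ~ FR
print(f"PC ~ FR + WL: FR = {b_fr:.2f}, WL = {b_wl:.2f}")
print(f"PC ~ FR:      FR = {b_fr_only:.2f}")
```

Including WL as a predictor opens a spurious path between FR and PC; dropping it recovers the true null effect.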
\[ \text{MI}(w, \ell, \sigma) = -\log \frac{p(w \mid \ell, \sigma, \mathcal{L}_{-\boldsymbol\ell})}{1 - p(w \mid \ell, \sigma, \mathcal{L}_{-\boldsymbol\ell})} \]
Here \(w\) is the word, \(\ell\) the lemma, \(\sigma\) the slot (e.g. PAST, SINGULAR), and \(\mathcal{L}_{-\boldsymbol\ell}\) the lexicon with the target lemma removed.
Example: \(w=\) walk, \(\ell=\) Walk, \(\sigma=\) [singular, past], \(\mathcal{L}_{-\boldsymbol\ell} =\) English minus Walk
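A minimal sketch of the word-level score, assuming the probability \(p(w \mid \ell, \sigma, \mathcal{L}_{-\boldsymbol\ell})\) has already been produced by some inflection model. The function name and the use of the natural logarithm are assumptions; the slides do not specify a log base.

```python
import math

def morphological_irregularity(p_form):
    """MI = -log(p / (1 - p)), the negative log-odds of the form.

    p_form is the (assumed given) model probability of the word given its
    lemma, slot, and the lexicon with the target lemma held out.
    """
    if not 0.0 < p_form < 1.0:
        raise ValueError("p_form must be strictly between 0 and 1")
    return -math.log(p_form / (1.0 - p_form))

# Illustrative probabilities only, not values from the talk.
mi_predictable = morphological_irregularity(0.9)   # negative: form is expected
mi_even = morphological_irregularity(0.5)          # zero: even odds
mi_surprising = morphological_irregularity(0.1)    # positive: form is unexpected
```

Forms the model predicts well get negative scores, and forms it predicts poorly get positive ones.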
\[ \text{MI}(\ell) = \frac{1}{|\mathcal{S}|} \sum_{\sigma\in \mathcal{S}} \text{MI}(\iota(\ell, \sigma), \ell, \sigma) \]
Here \(|\mathcal{S}|\) is the number of inflected forms associated with \(\ell\), the sum runs over the \(\text{MI}\) of those inflected forms, and \(\iota(\ell, \sigma)\) is the lemma inflected with morphological features \(\sigma\) (a word \(w\)).
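The lemma-level score is just the average of the word-level scores over the lemma's slots. The sketch below assumes per-slot probabilities are available from some inflection model; the slot labels and all numbers are made up for illustration.

```python
import math

def form_mi(p_form):
    """Word-level MI: negative log-odds of the inflected form."""
    return -math.log(p_form / (1.0 - p_form))

def lemma_mi(slot_probs):
    """Average word-level MI over the slots attested for one lemma.

    slot_probs maps each slot to the model probability of the correct
    inflected form for that slot (hypothetical input format).
    """
    return sum(form_mi(p) for p in slot_probs.values()) / len(slot_probs)

# Hypothetical probabilities: a regular verb and a suppletive one.
walk_probs = {"PAST": 0.95, "3SG.PRS": 0.97}
go_probs = {"PAST": 0.01, "3SG.PRS": 0.90}

mi_walk = lemma_mi(walk_probs)
mi_go = lemma_mi(go_probs)
print(f"walk: {mi_walk:.2f}  go: {mi_go:.2f}")
```

Under these invented numbers, the lemma with one poorly predicted form ends up with the higher average irregularity.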
\[ \text{PC}(w) = - \frac{\log p(w \mid \mathcal{L}_{-w})}{|w|} \]
Here \(w\) is the word, \(\mathcal{L}_{-w}\) the lexicon with the target word removed, and \(|w|\) the length of the word in phones.
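A minimal sketch of this score, assuming the word's log-probability decomposes into per-phone log-probabilities from some phone-level model trained with the target word held out. That decomposition, the input format, and the omission of any end-of-word symbol are assumptions made for illustration.

```python
def phonotactic_complexity(phone_logprobs):
    """PC(w) = -log p(w) / |w|.

    phone_logprobs holds one log-probability per phone of the word
    (hypothetical input), so their sum is log p(w) and their count is |w|.
    """
    if not phone_logprobs:
        raise ValueError("word must contain at least one phone")
    return -sum(phone_logprobs) / len(phone_logprobs)

# Made-up log-probabilities for a three-phone word.
example_pc = phonotactic_complexity([-1.0, -2.0, -3.0])
```

Dividing by length turns total surprisal into surprisal per phone, so long and short words can be compared.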
For each pair of variables, is there a relationship after controlling for variables noted in previous work?
Is there a relationship within an individual language?
Are languages that are more complex in one way less complex in another?
MI ~ PC + FR + WL
MI ~ PC + FR + WL + mean(PC) + mean(WL) + (1 + PC + FR + WL | language)
PC ~ WL
PC ~ WL + mean(WL) + (1 + WL | language)
MI ~ FR
PC ~ FR + WL
MI ~ PC + FR
MI ~ WL + FR + mean(WL) + (1 + FR + WL | language)
WL ~ FR
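The `mean(PC)` and `mean(WL)` terms in the formulas above are read here as per-language averages added as extra predictors, which is an assumption about the notation rather than something the slides state. Under that reading, the extra columns could be built as follows; the row format and column names are hypothetical.

```python
from collections import defaultdict

def add_language_means(rows, columns=("PC", "WL")):
    """Return copies of rows with a per-language mean added for each column."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row["language"]] += 1
        for col in columns:
            sums[row["language"]][col] += row[col]
    augmented_rows = []
    for row in rows:
        lang = row["language"]
        new_row = dict(row)  # leave the input rows unmodified
        for col in columns:
            new_row[f"mean_{col}"] = sums[lang][col] / counts[lang]
        augmented_rows.append(new_row)
    return augmented_rows

# Toy data, not values from the talk.
rows = [
    {"language": "A", "PC": 3.0, "WL": 4.0},
    {"language": "A", "PC": 5.0, "WL": 6.0},
    {"language": "B", "PC": 2.0, "WL": 8.0},
]
augmented = add_language_means(rows)
```

With both the word-level value and the language mean in the model, the word-level coefficient speaks to the within-language question and the mean's coefficient to the across-language one.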
Word Length and Frequency
Phonotactic Complexity and Frequency
Morphological Irregularity and Frequency
Morphological Irregularity and Word Length
Phonotactic Complexity and Morphological Irregularity
Phonotactic Complexity and Word Length